Goto

Collaborating Authors

 anomaly segmentation




MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning

Neural Information Processing Systems

Zero-and few-shot visual anomaly segmentation relies on powerful vision-language models that detect unseen anomalies using manually designed textual prompts. However, visual representations are inherently independent of language. In this paper, we explore the potential of a pure visual foundation model as an alternative to widely used vision-language models for universal visual anomaly segmentation.We present a novel paradigm that unifies anomaly segmentation into change segmentation. This paradigm enables us to leverage large-scale synthetic image pairs, featuring object-level and local region changes, derived from existing image datasets, which are independent of target anomaly datasets. We propose a one-prompt Meta-learning framework for Universal Anomaly Segmentation (MetaUAS) that is trained on this synthetic dataset and then generalizes well to segment any novel or unseen visual anomalies in the real world. To handle geometrical variations between prompt and query images, we propose a soft feature alignment module that bridges paired-image change perception and single-image semantic segmentation. This is the first work to achieve universal anomaly segmentation using a pure vision model without relying on special anomaly detection datasets and pre-trained visual-language models. Our method effectively and efficiently segments any anomalies with only one normal image prompt and enjoys training-free without guidance from language. Our MetaUAS significantly outperforms previous zero-shot, few-shot, and even full-shot anomaly segmentation methods.


IterMask3D: Unsupervised Anomaly Detection and Segmentation with Test-Time Iterative Mask Refinement in 3D Brain MR

Liang, Ziyun, Guo, Xiaoqing, Xu, Wentian, Ibrahim, Yasin, Voets, Natalie, Pretorius, Pieter M, Noble, J. Alison, Kamnitsas, Konstantinos

arXiv.org Artificial Intelligence

Unsupervised anomaly detection and segmentation methods train a model to learn the training distribution as'normal'. In the testing phase, they identify patterns that deviate from this normal distribution as'anomalies'. To learn the'normal' distribution, prevailing methods corrupt the images and train a model to reconstruct them. During testing, the model attempts to reconstruct corrupted inputs based on the learned'normal' distribution. Deviations from this distribution lead to high reconstruction errors, which indicate potential anomalies. However, corrupting an input image inevitably causes information loss even in normal regions, leading to suboptimal reconstruction and an increased risk of false positives. To alleviate this, we propose IterMask3D, an iterative spatial mask-refining strategy designed for 3D brain MRI. We iteratively spatially mask areas of the image as corruption and reconstruct them, then shrink the mask based on reconstruction error. This process iteratively unmasks'normal' areas to the model, whose information further guides reconstruction of'normal' patterns under the mask to be reconstructed accurately, reducing false positives. In addition, to achieve better reconstruction performance, we also propose using high-frequency image content as additional structural information to guide the reconstruction of the masked area. Extensive experiments on the detection of both synthetic and real-world imaging artifacts, as well as segmentation of various pathological lesions across multiple MRI sequences, consistently demonstrate the effectiveness of our proposed method. Introduction Segmenting anomalies is crucial in the field of medical image analysis as it enables applications such as early disease detection and diagnosis, guides treatment planning, and reduces clinical workload. Conventional anomaly segmentation methods are mostly supervised, relying on annotated training data, where images contain anomalies with corresponding manual labels. Trained on a limited set of data types, the model can only segment anomalies resembling those in its training data, and struggles to detect other types of unseen anomalies. In the context of brain MRI images, the focus of this study, this type of methods typically target a specific pathology (Kamnitsas et al., 2017; Isensee et al., 2021; Stollenga et al., 2015). Unsupervised anomaly segmentation, on the other hand, does not require any'anomalous' images or their manual segmentations during training. Instead, it is trained exclusively on'normal' images and treats this training distribution as the'normal' reference. During testing, the method regards any deviation from this reference as'anomaly' and attempts to segment it.




RDDPM: Robust Denoising Diffusion Probabilistic Model for Unsupervised Anomaly Segmentation

Moradi, Mehrdad, Paynabar, Kamran

arXiv.org Machine Learning

Recent advancements in diffusion models have demonstrated significant success in unsupervised anomaly segmentation. For anomaly segmentation, these models are first trained on normal data; then, an anomalous image is noised to an intermediate step, and the normal image is reconstructed through backward diffusion. Unlike traditional statistical methods, diffusion models do not rely on specific assumptions about the data or target anomalies, making them versatile for use across different domains. However, diffusion models typically assume access to normal data for training, limiting their applicability in realistic settings. In this paper, we propose novel robust denoising diffusion models for scenarios where only contaminated (i.e., a mix of normal and anomalous) unlabeled data is available. By casting maximum likelihood estimation of the data as a nonlinear regression problem, we reinterpret the denoising diffusion probabilistic model through a regression lens. Using robust regression, we derive a robust version of denoising diffusion probabilistic models. Our novel framework offers flexibility in constructing various robust diffusion models. Our experiments show that our approach outperforms current state of the art diffusion models, for unsupervised anomaly segmentation when only contaminated data is available. Our method outperforms existing diffusion-based approaches, achieving up to 8.08\% higher AUROC and 10.37\% higher AUPRC on MVTec datasets. The implementation code is available at: https://github.com/mehrdadmoradi124/RDDPM


AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection

Gao, Bin-Bin, Zhou, Yue, Yan, Jiangtao, Cai, Yuezhi, Zhang, Weixi, Wang, Meng, Liu, Jun, Liu, Yong, Wang, Lei, Wang, Chengjie

arXiv.org Artificial Intelligence

Universal visual anomaly detection aims to identify anomalies from novel or unseen vision domains without additional fine-tuning, which is critical in open scenarios. Recent studies have demonstrated that pre-trained vision-language models like CLIP exhibit strong generalization with just zero or a few normal images. However, existing methods struggle with designing prompt templates, complex token interactions, or requiring additional fine-tuning, resulting in limited flexibility. In this work, we present a simple yet effective method called AdaptCLIP based on two key insights. First, adaptive visual and textual representations should be learned alternately rather than jointly. Second, comparative learning between query and normal image prompt should incorporate both contextual and aligned residual features, rather than relying solely on residual features. AdaptCLIP treats CLIP models as a foundational service, adding only three simple adapters, visual adapter, textual adapter, and prompt-query adapter, at its input or output ends. AdaptCLIP supports zero-/few-shot generalization across domains and possesses a training-free manner on target domains once trained on a base dataset. AdaptCLIP achieves state-of-the-art performance on 12 anomaly detection benchmarks from industrial and medical domains, significantly outperforming existing competitive methods. We will make the code and model of AdaptCLIP available at https://github.com/gaobb/AdaptCLIP.


Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt

Gao, Bin-Bin

arXiv.org Artificial Intelligence

Unsupervised reconstruction networks using self-attention transformers have achieved state-of-the-art performance for multi-class (unified) anomaly detection with a single model. However, these self-attention reconstruction models primarily operate on target features, which may result in perfect reconstruction for both normal and anomaly features due to high consistency with context, leading to failure in detecting anomalies. Additionally, these models often produce inaccurate anomaly segmentation due to performing reconstruction in a low spatial resolution latent space. To enable reconstruction models enjoying high efficiency while enhancing their generalization for unified anomaly detection, we propose a simple yet effective method that reconstructs normal features and restores anomaly features with just One Normal Image Prompt (OneNIP). In contrast to previous work, OneNIP allows for the first time to reconstruct or restore anomalies with just one normal image prompt, effectively boosting unified anomaly detection performance. Furthermore, we propose a supervised refiner that regresses reconstruction errors by using both real normal and synthesized anomalous images, which significantly improves pixel-level anomaly segmentation. OneNIP outperforms previous methods on three industry anomaly detection benchmarks: MVTec, BTAD, and VisA. The code and pre-trained models are available at https://github.com/gaobb/OneNIP.


Zero-Shot Industrial Anomaly Segmentation with Image-Aware Prompt Generation

Park, SoYoung, Lee, Hyewon, Choi, Mingyu, Han, Seunghoon, Lee, Jong-Ryul, Lim, Sungsu, Kim, Tae-Ho

arXiv.org Artificial Intelligence

Anomaly segmentation is essential for industrial quality, maintenance, and stability. Existing text-guided zero-shot anomaly segmentation models are effective but rely on fixed prompts, limiting adaptability in diverse industrial scenarios. This highlights the need for flexible, context-aware prompting strategies. We propose Image-Aware Prompt Anomaly Segmentation (IAP-AS), which enhances anomaly segmentation by generating dynamic, context-aware prompts using an image tagging model and a large language model (LLM). IAP-AS extracts object attributes from images to generate context-aware prompts, improving adaptability and generalization in dynamic and unstructured industrial environments. In our experiments, IAP-AS improves the F1-max metric by up to 10%, demonstrating superior adaptability and generalization.